Developing a discrimination rule between breast cancer patients and controls using proteomics mass spectrometric data: a three-step approach.

Authors

  • A Geert Heidema
  • Nico Nagelkerke
Abstract

To discriminate between breast cancer patients and controls, we used a three-step approach to obtain our decision rule. First, we ranked the mass/charge values using random forests, because this method generates importance indices that take possible interactions into account. We observed that the top-ranked variables consisted of highly correlated contiguous mass/charge values, which we grouped into new variables in the second step. Finally, these newly created variables were used as predictors to find a suitable discrimination rule. In this last step, we compared three methods: Classification and Regression Tree (CART), logistic regression, and penalized logistic regression. Logistic regression and penalized logistic regression performed equally well, and both had a higher classification accuracy than CART. We chose the model obtained with penalized logistic regression because we hypothesized that it would provide better classification accuracy in the validation set. The resulting rule performed well on the training set, with a classification accuracy of 86.3%, a sensitivity of 86.8%, and a specificity of 85.7%.
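The second step and the reported performance measures can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the correlation threshold or how grouped mass/charge values were combined, so the threshold of 0.9, the use of the per-sample mean, and all function names below are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column has no defined correlation
    return sxy / sqrt(sxx * syy)


def group_contiguous(columns, top_indices, threshold=0.9):
    """Group top-ranked column indices that are adjacent on the m/z axis
    and highly correlated with the previous member of the group.
    The threshold is an assumed value, not taken from the paper."""
    groups = []
    for idx in sorted(top_indices):
        if groups:
            last = groups[-1][-1]
            if idx == last + 1 and pearson(columns[last], columns[idx]) >= threshold:
                groups[-1].append(idx)
                continue
        groups.append([idx])
    return groups


def merge_group(columns, group):
    """Build one new variable per group: the per-sample mean intensity
    (an assumed summary; the abstract does not name one)."""
    n_samples = len(columns[group[0]])
    return [sum(columns[j][i] for j in group) / len(group) for i in range(n_samples)]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity with patients coded 1 and controls coded 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: each inner list is one m/z column measured on four samples.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 1.0, 3.0, 2.0],
    [1.0, 1.0, 2.0, 5.0],
]
top_ranked = [1, 0, 3]  # hypothetical indices from a random forest ranking
groups = group_contiguous(columns, top_ranked)
new_variables = [merge_group(columns, g) for g in groups]
```

The merged variables would then serve as predictors for the classifiers compared in the third step (CART, logistic regression, penalized logistic regression), and `sensitivity_specificity` shows how the two reported rates are defined from predicted and true labels.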


Similar resources

DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION

Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...


The effect of reproductive factors on the risk of breast cancer: a case-control study

Background & Objectives: Breast cancer is a common malignancy in women in many parts of the world. The incidence of breast cancer in Iranian women is growing. Iranian patients are relatively younger than their western counterparts. We conducted a case-control study to determine roles of reproductive factors for breast cancer among women in Iran. Methods: A hospital based case-control study was ...


Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer.

BACKGROUND Surface-enhanced laser desorption/ionization (SELDI) is an affinity-based mass spectrometric method in which proteins of interest are selectively adsorbed to a chemically modified surface on a biochip, whereas impurities are removed by washing with buffer. This technology allows sensitive and high-throughput protein profiling of complex biological specimens. METHODS We screened for...


The Association of the MTHFR Gene Polymorphisms with Breast Cancer Susceptibility

Introduction: Breast cancer is the most common malignancy in women worldwide. It is also the second leading cause of cancer death among women after lung cancer. Considering the relationship among plasma folate levels, the level of uracil, and DNA damage in cell division, methyl tetrahydrofolate reductase (MTHFR) is a suitable candidate for studies on the susceptibility to cancer, including brea...


Study on the association of Epstein - Barr virus with breast cancer in Khorramabad breast cancer patients, Iran

Background: Breast cancer is one of the most common malignancies in the world, and early diagnosis of this cancer is a key factor in its treatment. This cancer is a multi-stage disease, in which viruses can play a role. EBV is known as an important factor in the development of some human cancers. Therefore, this study was conducted to determine the relationship between Epstein-Barr virus, EBV, ...



Journal:
  • Statistical applications in genetics and molecular biology

Volume 7, Issue 2


Publication date: 2008